Abstract: The task of compressing classical information in the
one-shot scenario is studied in the setting where the decompressor
additionally has access to some given quantum side information.
In this hybrid classical-quantum version of the famous
Slepian-Wolf problem, the smooth max-entropy is found to govern the
number of bits into which classical information can be compressed
so that it can be reliably recovered from the compressed
version and quantum side information. Combining this result
with known results on privacy amplification then yields bounds
on the amount of common randomness and secret key that can
be recovered in one shot from hybrid classical-quantum systems
using one-way classical communication.

Index Terms: quantum information, data compression,
Slepian-Wolf coding, smooth entropies

I. Introduction 

Information processing tasks, be they classical or quantum,
are typically studied in the setting of asymptotically
many independent and identically distributed (i.i.d.) resources.
Recent research has, however, extended our understanding to
the one-shot setting in which the resources are essentially
arbitrary and structureless. Various protocols have been studied,
such as extracting uniform randomness from a classical
random variable, extracting randomness uncorrelated with
possibly quantum adversaries (privacy amplification), as well
as quantum data compression, state merging, entanglement
distillation, and channel coding (see [1] for an overview on
classical protocols, [2] for an overview on quantum schemes
based on decoupling, as well as [3], [4], [5], [6], [7] for
corresponding results related to entanglement manipulation and
channel coding). In this generalized setting, the Shannon or
von Neumann entropies, which are normally used to quantify
the strength of the available resources, need to be replaced by
smooth entropies, first introduced for the classical case in [8],
[1], and subsequently extended to the quantum case for the
conditional and unconditional entropies in [9], [10] and the
relative entropy in [11].

Here, we present one-shot results for the tasks of classical
data compression with quantum side information and distillation
of common randomness or shared secret keys using
one-way communication. Our results show that the relevant
measure for characterizing the available resources is again
the smooth entropy, in accordance with the aforementioned
earlier findings. This confirms that, despite the generality of
the one-shot approach, it is possible to formulate a variety
of information-theoretic resource (in)equalities in terms of
a single type of entropy measure.¹ The situation is thus
analogous to the standard i.i.d.-based theory, where the von
Neumann entropy (which can be seen as a special case of the
smooth entropy [12]) takes this role.

The problem of classical data compression with quantum
side information at the decoder is a hybrid classical-quantum
version of the famous Slepian-Wolf problem [13], and was first
studied in the asymptotic i.i.d. scenario by Winter [14] and
Devetak & Winter [15]. There it is found that the classical random
variable $X$ can be compressed at a rate given by the
classical-quantum conditional entropy $H(X|B) = H(XB) - H(B)$
when the quantum system $B$ is available to the decoder.
Here $H(\cdot)$ is the von Neumann entropy, and classical random
variables are treated as quantum states diagonal in a fixed
basis. We show that in the one-shot scenario the classical
random variable can be compressed to a number of bits
given by the smooth conditional max-entropy, and that this
amount is optimal, up to small additive quantities involving
the smoothing parameter.

We then combine this result with known results on randomness
extraction and privacy amplification to characterize
protocols for both common randomness distillation and shared
secret-key distillation from hybrid classical-quantum states,
using one-way communication from the party
holding the classical variable $X$ to the party holding the
quantum system $B$. This task is relevant for post-processing
in quantum key distribution protocols. Moreover, these two
"static" problems are closely related to the "dynamic" tasks
of transmitting classical information in public or private over
a quantum channel. One-shot results have been derived for
the former case in [5], [6]. In [16] we use the static protocols
described here to directly construct optimal protocols for both
public and private communication over quantum channels.

The paper is organized as follows. In the next section we 
describe the three tasks under consideration more concretely, 
give the definitions of smooth entropies as used here, and 
state our main results. The following section is then devoted 
to the proofs. Finally, we discuss some open questions and 
applications of this result. 

¹ Note that the smooth entropy comes in two versions, the smooth
min-entropy and the smooth max-entropy. They are, however, dual to
each other (see Eq. (4) and subsequent discussion).






II. Definitions and Main Results 

Let us begin by describing the task of classical data compression
with quantum side information at the decoder. Suppose
that one party, Alice, holds a classical random variable $X$,
while a different party, Bob, holds a quantum random variable,
i.e. a quantum system $B$. The task of data compression with
side information is for Alice to encode $X$ into another random
variable $C$, such that Bob can reliably recover $X$ from $C$
and $B$. Clearly Alice could simply send $X$ itself, so we are
interested in how small $C$ can be made in principle. We assume
that the random variable $X$, as well as the state space of
system $B$, are finite. The two random variables are defined
by the ensemble $\{p_x, \varphi_x^B\}_{x\in\mathcal{X}}$, where
$\mathcal{X}$ is the alphabet over which $X$ is defined, $p_x$ is
its probability distribution, and $\varphi_x^B$ is the density
operator of system $B$ when $X$ takes the value $x$. In the
following, we will describe this ensemble by the
classical-quantum (cq) state
$\psi^{XB} = \sum_{x\in\mathcal{X}} p_x |x\rangle\langle x|^X \otimes \varphi_x^B$;
the compressed version of $X$ can be included by appending a
system $C$.

A protocol is specified by the encoding map
$\mathcal{E}: \mathcal{X} \to \{0,1\}^m$ and the decoding map
$\mathcal{D}: \mathcal{S}(\mathcal{H}^B) \times \{0,1\}^m \to \mathcal{X}$,
where $m = \log_2 |C|$ and $\mathcal{S}(\mathcal{H}^B)$ is the set of
density operators on the state space for system $B$.² (All
logarithms are to be understood as base 2 in what follows.) The
decoder generally consists of a quantum-mechanical measurement on
system $B$, conditioned on the value of $C$. This takes the form of
a POVM, a collection of positive operators $\Lambda^B_{x;c}$ such
that $\sum_x \Lambda^B_{x;c} = \mathbb{1}^B$ for all $c$. Therefore,
the decoder is generally probabilistic. The protocol
$(\mathcal{E}, \mathcal{D})$ is said to be $\epsilon$-reliable or
$\epsilon$-good when the average error probability is not greater
than $\epsilon$:

$$p_{\mathrm{err}} := \sum_{x\in\mathcal{X}} p_x \mathrm{Tr}\big[(\mathbb{1} - \Lambda_{x;\mathcal{E}(x)})\varphi_x\big] \le \epsilon. \tag{1}$$


Note that if we call $X'$ the output of the decoder, the
probability of error is equal to the variational distance (the
trace distance of classical random variables) of $p_{x,x'}$ to the
ideal output $p_x\delta_{x,x'}$:
$p_{\mathrm{err}} = \frac{1}{2}\sum_{x,x'} |p_x\delta_{x,x'} - p_{x,x'}|$.
Finally, we denote by $\ell^{\epsilon}_{\mathrm{enc}}(X|B)_\psi$ the
smallest achievable size of $\log |C|$ for an $\epsilon$-good
protocol applied to the state $\psi^{XB}$.
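
The identity between the error probability and the variational distance can be checked numerically. The following is a minimal sketch, not part of the original derivation; the joint distribution `joint` of the input $x$ and the decoder's guess $x'$ is a made-up example.

```python
def error_probability(joint):
    """Probability that the guess x' differs from x.

    joint[x][x2] is the joint probability p_{x,x'} (hypothetical data).
    """
    return sum(p for x, row in enumerate(joint)
               for x2, p in enumerate(row) if x != x2)


def distance_to_ideal(joint):
    """Variational distance between p_{x,x'} and p_x * delta_{x,x'}."""
    total = 0.0
    for x, row in enumerate(joint):
        p_x = sum(row)  # marginal distribution of X
        for x2, p in enumerate(row):
            ideal = p_x if x == x2 else 0.0
            total += abs(ideal - p)
    return total / 2


# Illustrative joint distribution of (x, x'); rows are x, columns are x'.
joint = [[0.45, 0.05, 0.00],
         [0.02, 0.28, 0.00],
         [0.01, 0.03, 0.16]]

p_err = error_probability(joint)
dist = distance_to_ideal(joint)
```

Both functions return the same number for any joint distribution, because every off-diagonal entry contributes once directly and once through the deficit on the diagonal.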

The two tasks of distilling either common randomness or a
shared secret key are closely related to the data compression
problem. The goal now is for Alice and Bob to not only
end up each holding a copy of the random variable $X$, but to
further transform this into either shared uniform randomness
or uniform randomness uncorrelated with a system $E$ held by
an eavesdropper. Again information $C$ is sent from Alice to
Bob, who each then generate new classical random variables
$K_A$ and $K_B$ such that $K_A = K_B$ from $X$ and $(B, C)$,
respectively. In each case we also demand that these outputs
are uncorrelated with $C$. Common randomness distillation is
a special case of secret key distillation with trivial $E$, so we
focus on secret key distillation in what follows.

The quality of the output can be measured by the trace
distance to the ideal state. The trace distance $D(\rho,\sigma)$ for
two states $\rho$ and $\sigma$ is defined by
$D(\rho,\sigma) := \frac{1}{2}\|\rho - \sigma\|_1$, where
$\|A\|_1 := \mathrm{Tr}\sqrt{A^\dagger A}$ for arbitrary $A$. The
output pair $K_A$ and $K_B$ of a protocol exchanging information $C$
is called an $\epsilon$-good secret key against $E$ if the output
state $\rho^{K_A K_B C E}$ is such that
$D(\rho^{K_A K_B C E}, \kappa^{K_A K_B} \otimes \rho^{CE}) \le \epsilon$,
where $\kappa^{K_A K_B} := \frac{1}{|K|}\sum_k |k\rangle\langle k|^{K_A} \otimes |k\rangle\langle k|^{K_B}$.
The number of $\epsilon$-good common random bits that can be
distilled in this manner starting from a shared state $\psi^{XBE}$
we call $\ell^{\epsilon}_{\mathrm{secr}}(X;B|E)_\psi$.

² Generally one may consider arbitrary sets for the output of the
encoding map, not just those of size $2^m$. We do this here for
simplicity.

Our goal is to bound the quantities
$\ell^{\epsilon}_{\mathrm{enc}}(X|B)_\psi$ and
$\ell^{\epsilon}_{\mathrm{secr}}(X;B|E)_\psi$ in terms of the smooth
min- and max-entropies. First, the conditional max-entropy for a
state $\rho^{AB}$ is defined by

$$H_{\max}(A|B)_\rho := \max_{\sigma} 2\log F(\rho^{AB}, \mathbb{1}^A \otimes \sigma^B), \tag{2}$$

where the maximization is over positive, normalized states $\sigma$
and $F(\rho,\sigma) := \|\sqrt{\rho}\sqrt{\sigma}\|_1$ is the
fidelity of $\rho$ and $\sigma$. Dual to the conditional max-entropy
is the conditional min-entropy,

$$H_{\min}(A|B)_\rho := \max_{\sigma} \big(-\log \lambda_{\min}(\rho^{AB}, \sigma^B)\big), \tag{3}$$

with $\lambda_{\min}(\rho^{AB}, \sigma^B) := \min\{\lambda : \rho^{AB} \le \lambda\, \mathbb{1}^A \otimes \sigma^B\}$.
The two are dual in the sense that

$$H_{\max}(A|B)_\rho = -H_{\min}(A|C)_\rho \tag{4}$$

for $\rho^{ABC}$ a pure state [12].
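
For intuition, both definitions simplify when the side information is classical as well. The sketch below is an illustration under that assumption only and does not cover genuinely quantum $B$. For a joint distribution $p_{xy}$, the maximization in Eq. (2) has the closed form $\log \sum_y (\sum_x \sqrt{p_{xy}})^2$ by the Cauchy-Schwarz inequality, and Eq. (3) reduces to minus the logarithm of the optimal guessing probability $\sum_y \max_x p_{xy}$. The function names are ours.

```python
from math import log2, sqrt


def h_max(p):
    """Classical conditional max-entropy H_max(X|Y).

    p[y][x] is the joint probability of (x, y). Closed form of Eq. (2)
    for commuting states: log sum_y (sum_x sqrt(p_xy))^2.
    """
    return log2(sum(sum(sqrt(q) for q in row) ** 2 for row in p))


def h_min(p):
    """Classical conditional min-entropy H_min(X|Y).

    Closed form of Eq. (3): -log of the guessing probability
    sum_y max_x p_xy.
    """
    return -log2(sum(max(row) for row in p))


uniform = [[0.25, 0.25, 0.25, 0.25]]   # trivial side information
correlated = [[0.5, 0.0], [0.0, 0.5]]  # Y is a perfect copy of X
skewed = [[0.5, 0.25, 0.25]]           # trivial side information
```

For `uniform` both entropies equal 2 bits, for `correlated` both vanish, and for `skewed` the min-entropy (1 bit) lies below the max-entropy, with the Shannon entropy (1.5 bits) in between.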

Each of these entropies can be smoothed by considering
possibly subnormalized states $\bar\rho^{AB}$ in the
$\epsilon$-neighborhood of $\rho^{AB}$, defined using the
purification distance $P(\rho,\sigma) := \sqrt{1 - F(\rho,\sigma)^2}$,

$$B_\epsilon(\rho) := \{\bar\rho : P(\rho,\bar\rho) \le \epsilon\}. \tag{5}$$

Note that the purification distance is essentially equivalent to
the trace distance, due to the bounds
$D(\rho,\sigma) \le P(\rho,\sigma) \le \sqrt{2D(\rho,\sigma)}$ [17].
The smoothed entropies are then given by

$$H^{\epsilon}_{\min}(A|B)_\rho := \max_{\bar\rho \in B_\epsilon(\rho^{AB})} H_{\min}(A|B)_{\bar\rho}, \tag{6}$$

$$H^{\epsilon}_{\max}(A|B)_\rho := \min_{\bar\rho \in B_\epsilon(\rho^{AB})} H_{\max}(A|B)_{\bar\rho}. \tag{7}$$

Furthermore, the dual of $H^{\epsilon}_{\max}(A|B)_\rho$ is
$H^{\epsilon}_{\min}(A|C)_\rho$, so that taking the dual and
smoothing can be performed in either order [17].
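
The relation between trace distance and purification distance quoted above is easy to probe for normalized classical distributions, where the fidelity is $\sum_i \sqrt{p_i q_i}$. The following sketch uses two made-up distributions `p` and `q`; it illustrates the bounds and is not a proof.

```python
from math import sqrt


def trace_distance(p, q):
    """Variational distance D(p, q) = (1/2) sum_i |p_i - q_i|."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def fidelity(p, q):
    """Classical fidelity F(p, q) = sum_i sqrt(p_i q_i)."""
    return sum(sqrt(a * b) for a, b in zip(p, q))


def purification_distance(p, q):
    """P(p, q) = sqrt(1 - F^2); max() guards against rounding below zero."""
    return sqrt(max(0.0, 1.0 - fidelity(p, q) ** 2))


# Hypothetical example distributions.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

d = trace_distance(p, q)
pd = purification_distance(p, q)
bounds_hold = d <= pd <= sqrt(2 * d)
```

In this example the trace distance is 0.1 and the purification distance is slightly larger, comfortably below $\sqrt{2D}$.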

Now we can state our main results. 

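Theorem 1 (Classical Data Compression with Quantum Side
Information at the Decoder). Given any $\epsilon \ge 0$ and state
$\psi^{XB} = \sum_x p_x |x\rangle\langle x|^X \otimes \varphi_x^B$,

$$H^{\sqrt{2\epsilon}}_{\max}(X|B)_\psi \le \ell^{\epsilon}_{\mathrm{enc}}(X|B)_\psi \le H^{\epsilon_1}_{\max}(X|B)_\psi + 2\log\frac{1}{\epsilon_2} + 4,$$

for $\epsilon_1, \epsilon_2 \ge 0$ such that $\epsilon = \epsilon_1 + \epsilon_2$.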

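Theorem 2 (Secret Key Distillation with One-Way Public
Communication). Given any $\epsilon \ge 0$ and a state
$\psi^{XBE} = \sum_x p_x |x\rangle\langle x|^X \otimes \varphi_x^{BE}$
and $\epsilon = \epsilon_1 + \epsilon_2$, $\epsilon' = \epsilon'_1 + \epsilon_2$,

$$\ell^{\epsilon+\epsilon'}_{\mathrm{secr}}(X;B|E)_\psi \ge \sup_{(U,V)\leftarrow X} \Big[H^{\epsilon'_1}_{\min}(U|EV)_\psi - H^{\epsilon_1}_{\max}(U|BV)_\psi\Big] - 4\log\frac{1}{\epsilon_2} - 3,$$

$$\ell^{\epsilon}_{\mathrm{secr}}(X;B|E)_\psi \le \sup_{(U,V)\leftarrow X} \Big[H^{\sqrt{2\epsilon}}_{\min}(U|EV)_\psi - H^{\sqrt{2\epsilon}}_{\max}(U|BV)_\psi\Big].$$

III. Proof of the Main Results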

A. Data Compression with Quantum Side Information 

To prove the upper bound of Theorem 1, often called the
direct part, we exhibit a protocol achieving it. The idea is for
Alice to sufficiently narrow the set of states $\varphi_x^B$ which
Bob is attempting to distinguish by providing him with the
information $\mathcal{E}(x)$. Our protocol makes use of 2-universal
hashing by the encoder and a variant of the pretty good
measurement [18] by the decoder. A family of hash functions
$f: \mathcal{X} \to \{0,1\}^m$ is called 2-universal if, when
choosing a function randomly from this family, the probability of
collision, $f(x) = f(y)$ for $x \ne y$, is at most the same as for
random functions: $\Pr_f[f(x) = f(y)] \le 1/2^m$ [19]. The proof
proceeds on the basis of the following two lemmas. The first is a
bound on the error probability based on Lemma 2 of [20].
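
As a concrete instance of a 2-universal family, one may take all linear maps $x \mapsto Mx$ over GF(2) with $M$ a binary $m \times n$ matrix. This is a standard example chosen here for illustration; the text does not prescribe a particular family. For tiny parameters the collision probability can be computed exactly by enumerating every matrix, as in the following sketch.

```python
from itertools import product


def linear_hash(matrix, x):
    """Apply the GF(2)-linear map x -> matrix * x (mod 2)."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in matrix)


def collision_probability(x, y, n, m):
    """Exact Pr_f[f(x) = f(y)] over all m-by-n binary matrices."""
    matrices = list(product(product((0, 1), repeat=n), repeat=m))
    hits = sum(linear_hash(M, x) == linear_hash(M, y) for M in matrices)
    return hits / len(matrices)


n, m = 3, 2  # 3-bit inputs hashed to 2 bits (illustrative sizes)
inputs = list(product((0, 1), repeat=n))
worst = max(collision_probability(x, y, n, m)
            for x in inputs for y in inputs if x != y)
```

For every pair of distinct inputs the collision probability equals $2^{-m}$, since each row of $M$ is orthogonal to the nonzero difference $x \oplus y$ with probability exactly one half.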

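Lemma 1. Let $\psi^{XB} = \sum_{x\in\mathcal{X}} p_x |x\rangle\langle x|^X \otimes \varphi_x^B$
be an arbitrary cq state with $\varphi^B = \sum_x p_x \varphi_x^B$,
$F$ be a 2-universal family of hash functions
$f: \mathcal{X} \to \{0,1\}^m$, and $P^{XB}$ be an operator of
the form $P^{XB} = \sum_x |x\rangle\langle x|^X \otimes \Pi_x^B$ with
$0 \le \Pi_x \le \mathbb{1}$ for all $x \in \mathcal{X}$. Then there
exists a family of measurements on $B$ indexed by $f \in F$ and
$c \in \{0,1\}^m$ and having elements $\Lambda^B_{x;c,f}$
corresponding to outcomes $x$, such that $\Lambda_{x';c,f} = 0$ when
$f(x') \ne c$, and for which the error probability
$p_{\mathrm{err}}$ averaged over a random choice of $f \in F$ obeys

$$p_{\mathrm{err}} := \frac{1}{|F|}\sum_{f,x} p_x \mathrm{Tr}\big[(\mathbb{1} - \Lambda_{x;f(x),f})\varphi_x\big] \le 2\,\mathrm{Tr}\big[(\mathbb{1} - P^{XB})\psi^{XB}\big] + 4\cdot 2^{-m}\,\mathrm{Tr}\big[P^{XB}(\mathbb{1}^X \otimes \varphi^B)\big].$$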

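Proof: The measurement on $B$ is defined by the pretty-good
measurement using all the $\Pi_x$ such that $f(x) = c$. It has
elements

$$\Lambda_{x;c,f} := \Big(\sum_{x': f(x')=c} \Pi_{x'}\Big)^{-1/2} \Pi_x \Big(\sum_{x': f(x')=c} \Pi_{x'}\Big)^{-1/2}$$

when $f(x) = c$ and $0$ otherwise. Using Lemma 2 of [20] we
can "unravel" these to obtain

$$\mathbb{1} - \Lambda_{x;c,f} \le 2(\mathbb{1} - \Pi_x) + 4\sum_{x' \ne x} \delta_{f(x'),c}\,\Pi_{x'}. \tag{8}$$

Next, consider the error probability for a given $x$ and $f$, and
therefore $c = f(x)$:

$$p_{\mathrm{err}}(x,f) = \mathrm{Tr}\big[(\mathbb{1} - \Lambda_{x;f(x),f})\varphi_x\big] \le 2\,\mathrm{Tr}\big[(\mathbb{1} - \Pi_x)\varphi_x\big] + 4\sum_{x' \ne x} \delta_{f(x'),f(x)}\,\mathrm{Tr}\big[\Pi_{x'}\varphi_x\big].$$

Averaging over $f$ and using the 2-universal property simplifies
the second term:

$$p_{\mathrm{err}}(x) \le 2\,\mathrm{Tr}\big[\varphi_x(\mathbb{1} - \Pi_x)\big] + 4\sum_{x' \ne x} \Pr_f[f(x') = f(x)]\,\mathrm{Tr}\big[\varphi_x\Pi_{x'}\big]$$
$$\le 2\,\mathrm{Tr}\big[\varphi_x(\mathbb{1} - \Pi_x)\big] + 4\cdot 2^{-m}\sum_{x' \ne x} \mathrm{Tr}\big[\varphi_x\Pi_{x'}\big]$$
$$\le 2\,\mathrm{Tr}\big[\varphi_x(\mathbb{1} - \Pi_x)\big] + 4\cdot 2^{-m}\sum_{x' \in \mathcal{X}} \mathrm{Tr}\big[\varphi_x\Pi_{x'}\big].$$

Now average over $x$ to get

$$p_{\mathrm{err}} \le 2\sum_{x\in\mathcal{X}} p_x \mathrm{Tr}\big[\varphi_x(\mathbb{1} - \Pi_x)\big] + 4\cdot 2^{-m}\sum_{x' \in \mathcal{X}} \mathrm{Tr}\big[\varphi\,\Pi_{x'}\big].$$

Using the form of $P^{XB}$ completes the proof. ■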
The second lemma is a corollary of a result proven in the
context of hypothesis testing by Audenaert et al. [21], [22]
(in particular, see Eq. 24 of [22]). Here $\{A\}_+$ denotes the
projector onto the support of the positive part of $A$ and
$\{A\}_-$ the projector onto the support of the nonpositive part.

Lemma 2. For $\rho, \sigma \ge 0$ and any $0 \le s \le 1$,

$$\mathrm{Tr}\big[\rho\{\rho - \sigma\}_- + \sigma\{\rho - \sigma\}_+\big] \le \mathrm{Tr}\big[\rho^{s}\sigma^{1-s}\big]. \tag{9}$$

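Proof of the Direct Part of Theorem 1: Let
$P^{XB} := \{\psi^{XB} - 2^{-(m-1)}\,\mathbb{1}^X \otimes \varphi^B\}_+$.
Combining Lemma 1 and Lemma 2 with $s = \frac{1}{2}$, the bound on
the error probability becomes

$$p_{\mathrm{err}} \le \sqrt{8}\cdot 2^{-m/2}\,\mathrm{Tr}\Big[\sqrt{\psi^{XB}}\sqrt{\mathbb{1}^X \otimes \varphi^B}\Big] \tag{10}$$
$$\le \sqrt{8}\cdot 2^{-m/2}\,\Big\|\sqrt{\psi^{XB}}\sqrt{\mathbb{1}^X \otimes \varphi^B}\Big\|_1 \tag{11}$$
$$\le \sqrt{8}\cdot 2^{-m/2}\,\max_{\sigma} F(\psi^{XB}, \mathbb{1}^X \otimes \sigma^B) \tag{12}$$
$$= \sqrt{8\cdot 2^{-(m - H_{\max}(X|B)_\psi)}}. \tag{13}$$

The second inequality is an immediate consequence of
an alternate expression for the trace norm,
$\|A\|_1 = \max_U |\mathrm{Tr}[UA]|$ for unitary $U$, which can be
seen by using the polar decomposition $A = V\sqrt{A^\dagger A}$.
Choosing $m = \lceil H_{\max}(X|B)_\psi + 2\log\frac{1}{\epsilon}\rceil + 3$
then implies $p_{\mathrm{err}} \le \epsilon$.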

Now consider constructing a protocol for a nearby state
$\bar\psi \in B_{\epsilon_1}(\psi)$ and suppose it achieves an error
probability $\epsilon_2$. Since the trace distance is upper bounded
by the purification distance, the error probability achieved by the
protocol when applied to $\psi$ itself will not be more than
$\epsilon_1 + \epsilon_2$. Choosing a cq state $\bar\psi$ minimizing
the max-entropy, which can be done by virtue of Lemma 3 in the
Appendix, it follows that we can set

$$m = \Big\lceil H^{\epsilon_1}_{\max}(X|B)_\psi + 2\log\frac{1}{\epsilon_2}\Big\rceil + 3 \tag{14}$$

and achieve this error probability. Since an error rate of
$\epsilon = \epsilon_1 + \epsilon_2$ can be achieved by selecting a
hash function $f$ at random, there must exist one such function
whose error rate does not exceed $\epsilon$. Finally, using
$\lceil x \rceil \le x + 1$ completes the proof. ■
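
The arithmetic behind the choice of $m$ can be traced with a few lines of code. This is a small sketch with illustrative numbers, not values from the paper: it evaluates the hash length of Eq. (14) for an assumed max-entropy of 2 bits and a target error of 0.1, then checks that the right-hand side of Eq. (13) stays below the target.

```python
from math import ceil, log2, sqrt


def hash_length(h_max, eps):
    """Hash output length m = ceil(H_max + 2 log(1/eps)) + 3."""
    return ceil(h_max + 2 * log2(1 / eps)) + 3


def error_bound(m, h_max):
    """Right-hand side of Eq. (13): sqrt(8 * 2^-(m - H_max))."""
    return sqrt(8 * 2 ** (-(m - h_max)))


h = 2.0    # assumed value of the (smooth) max-entropy, in bits
eps = 0.1  # assumed target error of the hashing step

m = hash_length(h, eps)
bound = error_bound(m, h)
upper = h + 2 * log2(1 / eps) + 4  # bound on m stated in Theorem 1
```

Here $m = 12$, the error bound evaluates to roughly 0.088, and $m$ does not exceed $H_{\max} + 2\log(1/\epsilon) + 4 \approx 12.64$, in line with the final step $\lceil x \rceil \le x + 1$.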
Proof of the Converse of Theorem 1: The converse rests
on the fact that the max-entropy of $X$ given $BC$ must be
small if $X$ is recoverable from $B$ and $C$. In fact,
$p_{\mathrm{err}} \le \epsilon$ implies
$H^{\sqrt{2\epsilon}}_{\max}(X|BC)_\psi \le 0$. To see this, suppose
we apply the protocol generating the guess $X'$ of $X$ from $BC$.
This is a quantum operation, and therefore the max-entropy cannot
decrease (by Theorem 18 of [17]), meaning
$H^{\sqrt{2\epsilon}}_{\max}(X|BC)_\psi \le H^{\sqrt{2\epsilon}}_{\max}(X|X')_{\psi'}$.
But since $p_{\mathrm{err}} \le \epsilon$, it follows that the
ideal output must be an element of
$B_{\sqrt{2\epsilon}}((\psi')^{XX'})$ by the bounds between trace and
purification distances [17]. Thus,
$H^{\sqrt{2\epsilon}}_{\max}(X|X')_{\psi'} = 0$.

Now select $\bar\psi^{XBC} \in B_{\sqrt{2\epsilon}}(\psi^{XBC})$ to
minimize $H_{\max}(X|BC)_{\bar\psi}$. By the chain rule
$H_{\max}(X|BC)_{\bar\psi} \ge H_{\max}(X|B)_{\bar\psi} - \log |C|$
of Lemma 4 in the Appendix, we have

$$\log |C| \ge H_{\max}(X|B)_{\bar\psi} \ge H^{\sqrt{2\epsilon}}_{\max}(X|B)_\psi, \tag{15}$$

completing the proof. ■

B. Common Randomness and Secret-Key Distillation 

By combining the data compression result with known 
results on randomness extraction and privacy amplification, we 
can easily construct one-shot protocols for distilling common 
randomness or secret-keys. 

Recall from [9], [10], [23], [24] that privacy amplification of
the random variable $X$ against an adversary holding a possibly
quantum register $E$ can yield a number of $\epsilon$-good random
bits $\ell^{\epsilon}_{\mathrm{ext}}(X|E)_\psi$ in accordance with the
following bounds, where
$\psi^{XE} = \sum_x p_x |x\rangle\langle x|^X \otimes \varphi_x^E$.
Note that previous results used the trace distance to define the
smoothing, which accounts for the slight difference in the form of
the upper bound given here.

Theorem 3 (Privacy Amplification [9], [10], [23], [24]). For
any $\epsilon_1 + \epsilon_2 = \epsilon$,³

$$H^{\epsilon_1}_{\min}(X|E)_\psi - 2\log\frac{1}{\epsilon_2} + 1 \le \ell^{\epsilon}_{\mathrm{ext}}(X|E)_\psi \le H^{\sqrt{2\epsilon}}_{\min}(X|E)_\psi.$$



To distill a secret key from a state
$\psi^{XBE} = \sum_x p_x |x\rangle\langle x|^X \otimes \varphi_x^{BE}$,
in principle Alice and Bob need only first run the data compression
scheme and then perform privacy amplification as in Theorem 3. If
they require the result to be uncorrelated with the classical
message $C$, this can be simply lumped together with $E$ as defining
the adversary. For such a two-step protocol, the overall
approximation parameter will consist of a sum of the parameters of
the various parts, by the triangle inequality of the trace distance.
In the following, $\epsilon$ will denote the error in data
compression, $\epsilon'$ the error in privacy amplification.

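Proof of the direct part of Theorem 2: To prove the lower
bound of Theorem 2 we start by ignoring the supremum over
functions taking $X$ to $(U,V)$. From Theorem 3 we have

$$\ell^{\epsilon+\epsilon'}_{\mathrm{secr}}(X;B|E)_\psi \ge H^{\epsilon'_1}_{\min}(X|CE)_\psi - 2\log\frac{1}{\epsilon'_2} + 1. \tag{16}$$

Now we may simplify the right-hand side by using Lemma 5
of the Appendix (a slight modification of the chain rule part of
Theorem 3.2.12 of [9]), which in the present context translates
to $H^{\epsilon'_1}_{\min}(XC|E) \le H^{\epsilon'_1}_{\min}(X|CE) + \log |C|$.
Then, since $C$ is a deterministic function of $X$, it follows that
$H^{\epsilon'_1}_{\min}(XC|E) = H^{\epsilon'_1}_{\min}(X|E)$. On the
other hand, from Theorem 1 we have
$\log |C| \le H^{\epsilon_1}_{\max}(X|B)_\psi + 2\log\frac{1}{\epsilon_2} + 4$,
meaning

$$\ell^{\epsilon+\epsilon'}_{\mathrm{secr}}(X;B|E)_\psi \ge H^{\epsilon'_1}_{\min}(X|E)_\psi - H^{\epsilon_1}_{\max}(X|B)_\psi - 2\log\frac{1}{\epsilon_2\epsilon'_2} - 3. \tag{17}$$
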
Finally, the bound can be immediately improved by considering
preprocessing in which Alice first computes $U$ and $V$
from $X$, and publicly distributes $V$ to Bob (meaning Eve also
obtains a copy). This yields

$$\ell^{\epsilon+\epsilon'}_{\mathrm{secr}}(X;B|E)_\psi \ge \sup_{(U,V)\leftarrow X} \Big[H^{\epsilon'_1}_{\min}(U|EV)_\psi - H^{\epsilon_1}_{\max}(U|BV)_\psi\Big] - 2\log\frac{1}{\epsilon_2\epsilon'_2} - 3. \tag{18}$$

Choosing $\epsilon'_2 = \epsilon_2$ completes the proof. ■

³ The extra $+1$ in the lower bound comes from rounding and the fact
that [9] uses a slightly different distance measure.
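
To see how the constants combine, the lower bound without the supremum can be evaluated directly. The sketch below uses hypothetical entropy values and is only meant to trace the bookkeeping: the $+1$ from Eq. (16) and the $+4$ from Theorem 1 merge into $-3$, and setting $\epsilon'_2 = \epsilon_2$ turns $2\log\frac{1}{\epsilon_2\epsilon'_2}$ into $4\log\frac{1}{\epsilon_2}$.

```python
from math import log2


def key_length_lower_bound(h_min_e, h_max_b, eps2, eps2_prime=None):
    """Eq. (17): H_min - H_max - 2 log(1 / (eps2 * eps2')) - 3.

    h_min_e and h_max_b stand for the smooth entropies of X given E
    and of X given B; their values must be supplied by the caller.
    """
    if eps2_prime is None:
        eps2_prime = eps2  # the choice made at the end of the proof
    return h_min_e - h_max_b - 2 * log2(1 / (eps2 * eps2_prime)) - 3


# Hypothetical numbers: 100 bits of min-entropy against Eve,
# 20 bits of max-entropy given Bob's side information.
eps2 = 2 ** -5
bound = key_length_lower_bound(100.0, 20.0, eps2)
same = 100.0 - 20.0 - 4 * log2(1 / eps2) - 3
```

With these numbers the bound guarantees 57 secret bits, and both forms of the penalty term agree.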
Proof of the converse of Theorem 2: Now we consider
the converse for a generic key distillation protocol in which
Alice generates $U$ and $V$ from $X$, broadcasts the latter and
uses the former as the key, while Bob generates his version of
the key $U'$ from $B$ and $V$. Let the cq state $\psi^{XBE}$ be the
input to this process and $(\psi')^{UU'VE}$ be the output and
suppose that the latter is an $\epsilon$-good approximation to a
secret key of size $n = \ell^{\epsilon}_{\mathrm{secr}}(X;B|E)_\psi$
bits. As in the proof of the converse to Theorem 1 this implies
$H^{\sqrt{2\epsilon}}_{\max}(U|U'V)_{\psi'} \le 0$, and similar
reasoning implies $H^{\sqrt{2\epsilon}}_{\min}(U|EV)_{\psi'} \ge n$.
Thus,

$$n \le H^{\sqrt{2\epsilon}}_{\min}(U|EV)_{\psi'} - H^{\sqrt{2\epsilon}}_{\max}(U|U'V)_{\psi'} \tag{19}$$
$$\le H^{\sqrt{2\epsilon}}_{\min}(U|EV)_{\psi} - H^{\sqrt{2\epsilon}}_{\max}(U|BV)_{\psi} \tag{20}$$
$$\le \sup_{(U,V)\leftarrow X} \Big[H^{\sqrt{2\epsilon}}_{\min}(U|EV)_{\psi} - H^{\sqrt{2\epsilon}}_{\max}(U|BV)_{\psi}\Big]. \tag{21}$$

Here the second inequality follows from the non-decrease of
the max-entropy under quantum operations (the data processing
inequality), Theorem 18 of [17]. ■

IV. Conclusions 

By characterizing the one-shot capabilities of data com- 
pression and secret key distillation in terms of smooth min- 
and max-entropies, we provide further evidence that a useful 
general theory of one-shot protocols does indeed exist and 
does not require the definition and study of new quantities for 
each individual protocol. 

Our results may also be specialized to the case of
asymptotically many non-i.i.d. resources, and we find expressions
for data compression and secret-key distillation in terms
of spectral entropy rates, as introduced by Han & Verdú [25],
[26] for the classical case and generalized by Hayashi &
Nagaoka [20] to quantum information. For instance, inserting
the result of Theorem 1 into the correspondence formulas
derived in [27], we immediately find an expression for optimal
data compression in terms of a spectral entropy rate, which
complements a result on data compression derived in [28]
(the latter applies to a setting without side information, but
where the data to be compressed is quantum-mechanical).
Furthermore, we note that the known expressions for data
compression and secret-key distillation in the i.i.d. case can be
readily recovered from our results by virtue of the Quantum
Asymptotic Equipartition Property [12].

We may also immediately infer two entropy relations from
our results, a chain rule for max-entropies and an uncertainty
relation similar to the one given in [29]. To derive a chain
rule, consider the problem of compressing a joint classical
random variable $XY$, with quantum side information available
at the decoder. One way to construct a compression protocol
for $XY$ is to first compress $X$ in a way suitable for a decoder
with access to $B$ and then to compress $Y$ for a decoder with
access to $XB$. Clearly this will not be better than the optimal
protocol.

Suppose that the first step succeeds in identifying $X$ with
average error probability $\epsilon_X$. We may imagine that the
decoder coherently performs the appropriate measurement on
$B$ and stores the result in an auxiliary system $X'$; using
the purification of the input cq state $\psi^{XB}$ it is then easy
to show that the actual result of this process, the state
$\bar\xi^{XX'B}$, is essentially the same as the ideal output
$\xi^{XX'B}$ in which $X'$ is simply a copy of $X$ and the side
information $B$ is untouched. In particular, including the random
variable $Y$, we have
$\frac{1}{2}\|\bar\xi^{XX'YB} - \xi^{XX'YB}\|_1 \le \sqrt{2\epsilon_X}$.

Since the side information in $B$ is essentially unchanged,
it can subsequently be used to help determine $Y$. Now
let $\epsilon_Y$ be the average error probability for a compression
scheme of $Y$ given an exact copy of $X$ at the decompressor,
i.e. the input described by $\xi^{XX'YB}$. Using $Y'$ to store
the output of the measurement, the triangle inequality and
contractivity of the trace distance under partial trace imply
$\frac{1}{2}\|\bar\xi^{XX'YY'} - \xi^{XX'YY'}\|_1 \le \sqrt{2\epsilon_X} + \epsilon_Y$,
where $\xi^{XX'YY'B}$ is again the ideal output in which $X' = X$
and $Y' = Y$. Working out the trace distance for states of this form
reveals that it simply equals the error probability, so the total
probability of incorrectly determining $X$ and $Y$ is no greater
than $\sqrt{2\epsilon_X} + \epsilon_Y$. After setting
$\epsilon_X = \epsilon_Y = 2\epsilon$ and
$\epsilon' = 2(\epsilon + \sqrt{\epsilon})$, Theorem 1 implies

$$H^{\epsilon}_{\max}(X|B)_\psi + H^{\epsilon}_{\max}(Y|XB)_\psi \ge H^{\sqrt{2\epsilon'}}_{\max}(XY|B)_\psi - 4\log\frac{1}{\epsilon} - 8. \tag{22}$$
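
The parameters entering Eq. (22) follow from simple bookkeeping, which the sketch below retraces for an illustrative $\epsilon = 0.01$ (our choice). Each compression step is run with $\epsilon_1 = \epsilon_2 = \epsilon$, so it errs with probability at most $2\epsilon$ and costs at most $2\log\frac{1}{\epsilon} + 4$ bits beyond the corresponding smooth max-entropy.

```python
from math import isclose, log2, sqrt


def total_error(eps_x, eps_y):
    """Combined error bound sqrt(2 eps_X) + eps_Y of the two-step scheme."""
    return sqrt(2 * eps_x) + eps_y


def overhead_bits(eps, steps=2):
    """Extra bits from applying the Theorem 1 upper bound `steps` times."""
    return steps * (2 * log2(1 / eps) + 4)


eps = 0.01                                # illustrative smoothing parameter
eps_prime = total_error(2 * eps, 2 * eps) # error of the combined protocol
overhead = overhead_bits(eps)             # penalty appearing in Eq. (22)
```

The combined error equals $2(\epsilon + \sqrt{\epsilon}) = 0.22$ and the overhead equals $4\log\frac{1}{\epsilon} + 8$, matching the constants in Eq. (22).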

Another immediate application of our results is the derivation
of an uncertainty relation similar to the one given in [29].
In contrast to the proof in [29], the derivation makes use of
the operational meaning of these quantities. A recent result
by one of us shows that, just as min- and max-entropy are in
some sense dual, protocols for privacy amplification and data
compression are dual, too [30]. Specifically, a linear protocol
for data compression with side information can be transformed
into a linear protocol for privacy amplification, and, under
certain conditions, vice versa. Thus, we can start with a data
compression protocol operating in accord with Theorem 1 and
transform it into a privacy amplification protocol, at which
point it is subject to the constraints of Theorem 3. The end
result of this analysis, carried out in more detail in [30], is the
following uncertainty relation, valid for arbitrary $\epsilon > 0$:

$$H^{\epsilon}_{\min}(X_A|R)_\psi + H^{\epsilon}_{\max}(Z_A|B)_\psi \ge \log_2 d - 8\log\frac{1}{\epsilon} - 12. \tag{23}$$



Here $A$ is a system of dimension $d$, held by Alice. She can
perform one of two measurements, corresponding to the two bases
which are eigenbases of the operators
$X_A = \sum_{k=0}^{d-1} |k+1\rangle\langle k|$ and
$Z_A = \sum_{k=0}^{d-1} \omega^k |k\rangle\langle k|$, for
$\omega = e^{2\pi i/d}$. $B$ and $R$ are
additional (quantum) systems whose purpose is to help predict
the outcomes of hypothetical $Z_A$ and $X_A$ measurements,
respectively; the capability of one constrains the capability
of the other.



We have only concentrated on protocols whose ultimate aim 
is to process classical information, albeit perhaps stored in 
quantum states; protocols manipulating quantum information, 
such as in entanglement distillation, are not considered. How- 
ever, there exists a strong connection between the two, at least 
in the asymptotic i.i.d. scenario, and it could be fruitful to 
extend this connection to the one-shot case. 

Taking the case of entanglement distillation, the first proof
of the achievable distillation rate proceeds by first establishing
the achievable rate of secret-key distillation and then showing
that coherently performing the protocol results in an
entanglement distillation protocol [31]. An equivalent distillation
protocol can be constructed by combining two protocols for
classical data compression with quantum side information,
one for each of two complementary bases (related by Fourier
transform), as shown in [32]. Thus, there are two possible
ways to construct one-shot entanglement distillation from the
results presented here, and it would be interesting to compare
with more "fully" quantum approaches, such as [33], [4].

Appendix A 
CQ Smoothing 

In this appendix we show that the optimal state for smoothing
the max-entropy of a cq state is itself a cq state. A similar
result was shown for the min-entropy in Remark 3.2.4 of [9].

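Lemma 3. Let $\rho^{XB}$ be a cq state and let $\epsilon \ge 0$.
Then there exists a cq state $\bar\rho^{XB} \in B_\epsilon(\rho^{XB})$
such that

$$H^{\epsilon}_{\max}(X|B)_\rho = H_{\max}(X|B)_{\bar\rho}.$$

Proof: Observe first that a state $\rho^{XB}$ is a cq state if and
only if it has a purification
$\rho^{XX'BC} = |\Psi\rangle\langle\Psi|$ of the form

$$|\Psi\rangle^{XX'BC} = \sum_x \alpha_x |x\rangle^X \otimes |x\rangle^{X'} \otimes |\phi_x\rangle^{BC}, \tag{24}$$

where $X$ and $X'$ are isomorphic and where $\{|x\rangle\}_x$ is an
orthonormal basis of these spaces.

By the duality between smooth min- and max-entropy [17],
it suffices to show that there exists a (subnormalized) vector
$|\bar\Psi\rangle$ of the form (24) such that
$\bar\rho^{XX'BC} = |\bar\Psi\rangle\langle\bar\Psi| \in B_\epsilon(\rho^{XX'BC})$
and

$$H^{\epsilon}_{\min}(X|X'C)_\rho = H_{\min}(X|X'C)_{\bar\rho}. \tag{25}$$

To show this, let $\hat\rho^{XX'C} \in B_\epsilon(\rho^{XX'C})$ be
such that the min-entropy is maximized, i.e.,

$$H^{\epsilon}_{\min}(X|X'C)_\rho = H_{\min}(X|X'C)_{\hat\rho}.$$

By the definition of the purified distance, there exists a
purification $\hat\rho^{XX'BC} = |\hat\Psi\rangle\langle\hat\Psi|$
that is $\epsilon$-close to $\rho^{XX'BC}$, i.e.,
$|\langle\hat\Psi|\Psi\rangle| = \sqrt{1 - \epsilon^2}$. Using the
projector
$P^{XX'} := \sum_x |x\rangle\langle x|^X \otimes |x\rangle\langle x|^{X'}$,
define $|\bar\Psi\rangle := (P^{XX'} \otimes \mathbb{1}^{BC})|\hat\Psi\rangle$.
Since $(P^{XX'} \otimes \mathbb{1}^{BC})|\Psi\rangle = |\Psi\rangle$,
we have

$$|\langle\bar\Psi|\Psi\rangle| = |\langle\hat\Psi|(P^{XX'} \otimes \mathbb{1}^{BC})|\Psi\rangle| = |\langle\hat\Psi|\Psi\rangle| = \sqrt{1 - \epsilon^2}$$

and, hence, $\bar\rho^{XX'BC} \in B_\epsilon(\rho^{XX'BC})$.
By the definition of min-entropy, there exists a state
$\sigma^{X'C}$ such that

$$\hat\rho^{XX'C} \le \lambda\, \mathbb{1}^X \otimes \sigma^{X'C}$$

for $\lambda = 2^{-H_{\min}(X|X'C)_{\hat\rho}}$. Applying the
projection $P^{XX'}$ on both sides of this operator inequality gives

$$\bar\rho^{XX'C} = (P^{XX'} \otimes \mathbb{1}^C)\hat\rho^{XX'C}(P^{XX'} \otimes \mathbb{1}^C) \le \lambda (P^{XX'} \otimes \mathbb{1}^C)(\mathbb{1}^X \otimes \sigma^{X'C})(P^{XX'} \otimes \mathbb{1}^C) \le \lambda\, \mathbb{1}^X \otimes \bar\sigma^{X'C}$$

for $\bar\sigma^{X'C} := \sum_x (|x\rangle\langle x| \otimes \mathbb{1}^C)\sigma^{X'C}(|x\rangle\langle x| \otimes \mathbb{1}^C)$.
This immediately implies that

$$H_{\min}(X|X'C)_{\hat\rho} \le H_{\min}(X|X'C)_{\bar\rho}.$$

Since the opposite inequality ($\ge$) holds by definition of the
smooth min-entropy, we have proved (25), which concludes
the proof. ■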

Appendix B 
Chain Rules 

Here we prove two chain rules which are important for the
converses of Theorems 1 and 2.



Lemma 4. For $C$ classical,

$$H_{\max}(A|BC) \ge H_{\max}(A|B) - \log |C|. \tag{26}$$

Proof: The general form of the state is
$\rho^{ABC} = \sum_c p_c\, \rho_c^{AB} \otimes |c\rangle\langle c|^C$,
where $p_c$ is a probability distribution and the $\rho_c^{AB}$ are
normalized states. A purification of $\rho^{ABC}$ is
$|\psi\rangle^{ABCC'R} = \sum_c \sqrt{p_c}\, |c,c\rangle^{CC'} |\psi_c\rangle^{ABR}$,
for $|\psi_c\rangle^{ABR}$ a purification of $\rho_c^{AB}$. By
duality, Eq. (4), the stated inequality is equivalent to
$H_{\min}(A|RCC')_\psi \ge H_{\min}(A|RC')_\psi - \log |C'|$
(since $|C| = |C'|$). We now establish the equivalent form.
First, make the definitions
$|\varphi_c\rangle^{ABCR} := \sqrt{p_c}\, |c\rangle^C |\psi_c\rangle^{ABR}$,
$|\varphi\rangle^{ABCR} := \sum_c |\varphi_c\rangle^{ABCR}$, and
$\bar\varphi^{ABCRC'} := \sum_c |c\rangle\langle c|^{C'} \otimes \varphi_c^{ABCR}$,
and let $H_{\min}(\rho^{AB}|\sigma^B)$ be the min-entropy as defined
in Eq. (3), but without the maximization over $\sigma^B$. Using
Lemma 3.1.14 of [9] we conclude

$$H_{\min}(\varphi^{ARC}|\sigma^{RC}) \ge H_{\min}(\bar\varphi^{ARCC'}|\sigma^{RCC'}) - H_{\max}(\bar\varphi^{C'})$$
$$\ge H_{\min}(\bar\varphi^{ARCC'}|\sigma^{RCC'}) - \log |C|,$$

for arbitrary $\sigma^{RCC'}$ and $\sigma^{RC} = \mathrm{Tr}_{C'}[\sigma^{RCC'}]$,
and where the second line follows from the fact that the max-entropy
is upper bounded by the logarithm of the state-space dimension
(alphabet size). If we choose $\sigma^{RCC'}$ so that
$H_{\min}(\bar\varphi^{ARCC'}|\sigma^{RCC'}) = H_{\min}(A|RCC')_{\bar\varphi}$,
then we obtain

$$H_{\min}(A|RC)_\varphi \ge H_{\min}(\varphi^{ARC}|\sigma^{RC}) \ge H_{\min}(A|RCC')_{\bar\varphi} - \log |C|.$$

Now observe that unitarily copying $C$ to $C'$ in $\varphi^{ABCR}$
results in $\psi^{ABCRC'}$. This will not affect the conditional
entropy, so $H_{\min}(A|RCC')_\psi = H_{\min}(A|RC)_\varphi$.
Likewise, $C$ can be deleted given $C'$ in $\bar\varphi^{ABCRC'}$,
producing the state
$\psi^{ABRC'} = \sum_c p_c |c\rangle\langle c|^{C'} \otimes \psi_c^{ABR}$.
Thus, $H_{\min}(A|RC')_\psi = H_{\min}(A|RCC')_{\bar\varphi}$,
completing the proof. ■
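Lemma 5. $H^{\epsilon}_{\min}(AB|C) \le H^{\epsilon}_{\min}(A|BC) + \log |B|$.

Proof: Start by choosing $\bar\psi^{ABC} \in B_\epsilon(\psi^{ABC})$
such that $H^{\epsilon}_{\min}(AB|C)_\psi = H_{\min}(AB|C)_{\bar\psi}$.
From the definition of conditional min-entropy, we have
$\bar\psi^{ABC} \le 2^{-H_{\min}(AB|C)_{\bar\psi}}\, \mathbb{1}^{AB} \otimes \sigma^C$,
for the optimal $\sigma^C$. Defining
$\eta^{BC} := \frac{1}{|B|}\mathbb{1}^B \otimes \sigma^C$, this is
equivalent to
$\bar\psi^{ABC} \le 2^{-H_{\min}(AB|C)_{\bar\psi}}\, |B|\, \mathbb{1}^A \otimes \eta^{BC}$.
Using the definition once again, we can easily see that
$2^{-H_{\min}(AB|C)_{\bar\psi}}\, |B| \ge 2^{-H_{\min}(A|BC)_{\bar\psi}}$,
or equivalently,
$H_{\min}(AB|C)_{\bar\psi} \le H_{\min}(A|BC)_{\bar\psi} + \log |B|$.
Finally, the fact that
$H_{\min}(A|BC)_{\bar\psi} \le H^{\epsilon}_{\min}(A|BC)_\psi$
completes the proof. ■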
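
Both chain rules can be sanity-checked in the fully classical, unsmoothed special case, where the entropies have closed forms: $H_{\max}(A|B) = \log\sum_b(\sum_a\sqrt{p_{ab}})^2$ and $H_{\min}(A|B) = -\log\sum_b\max_a p_{ab}$. These closed forms are an assumption of this sketch (they follow from Eqs. (2) and (3) for commuting states) and the check says nothing about genuinely quantum systems. The code draws random joint distributions and evaluates the slack in Lemma 4 and in Lemma 5 with $\epsilon = 0$.

```python
import random
from math import log2, sqrt


def random_joint(na, nb, nc, rng):
    """Random joint distribution p[a][b][c] on a small alphabet."""
    w = [[[rng.random() for _ in range(nc)] for _ in range(nb)]
         for _ in range(na)]
    total = sum(v for plane in w for row in plane for v in row)
    return [[[v / total for v in row] for row in plane] for plane in w]


def chain_rule_gaps(p):
    """Return the slack in Lemma 4 and Lemma 5 (both should be >= 0)."""
    na, nb, nc = len(p), len(p[0]), len(p[0][0])
    # Lemma 4: H_max(A|BC) - H_max(A|B) + log|C| >= 0
    hmax_a_bc = log2(sum(sum(sqrt(p[a][b][c]) for a in range(na)) ** 2
                         for b in range(nb) for c in range(nc)))
    hmax_a_b = log2(sum(sum(sqrt(sum(p[a][b])) for a in range(na)) ** 2
                        for b in range(nb)))
    # Lemma 5 (epsilon = 0): H_min(A|BC) + log|B| - H_min(AB|C) >= 0
    hmin_a_bc = -log2(sum(max(p[a][b][c] for a in range(na))
                          for b in range(nb) for c in range(nc)))
    hmin_ab_c = -log2(sum(max(p[a][b][c]
                              for a in range(na) for b in range(nb))
                          for c in range(nc)))
    return (hmax_a_bc - hmax_a_b + log2(nc),
            hmin_a_bc + log2(nb) - hmin_ab_c)


rng = random.Random(0)  # fixed seed so the run is reproducible
gaps = [chain_rule_gaps(random_joint(2, 3, 2, rng)) for _ in range(100)]
all_nonnegative = all(min(g) >= -1e-9 for g in gaps)
```

A numerical check of this kind cannot replace the proofs above, but it is a quick way to catch sign or direction errors when the inequalities are transcribed.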